35 research outputs found

    Rumour stance and veracity classification in social media conversations

    Social media platforms are popular as sources of news, often delivering updates faster than traditional news outlets. The absence of verification of the posted information leads to wide proliferation of misinformation. The propagation of such false information can have far-reaching consequences for society. Traditional manual verification by fact-checking professionals does not scale to the amount of misinformation being spread. Therefore, there is a need for an automated verification tool that would assist the process of rumour resolution. In this thesis we address the problem of rumour verification in social media conversations from a machine learning perspective. Rumours that attract a lot of scepticism in the form of questions and denials among the responses are more likely to be proven false later (Zhao et al., 2015). Thus we explore how crowd wisdom in the form of the stance of responses towards a rumour can contribute to an automated rumour verification system. We study ways of determining the stance of each response in a conversation automatically. We focus on the importance of incorporating conversation structure into stance classification models and on identifying characteristics of supporting, denying, questioning and commenting posts. We then propose several models for rumour veracity classification that incorporate different feature sets, including the stance of the responses, attempting to find the set that leads to the most accurate models across several datasets. We view the rumour resolution process as a sequence of tasks: rumour detection, tracking, stance classification and, finally, rumour verification. We then study relations between the tasks in the rumour verification pipeline through a joint learning approach, showing its benefits compared with single-task learning. Finally, we address the issue of transparency of model decisions by incorporating uncertainty estimation methods into rumour verification models. We then conclude and point to directions for future research.
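As a rough illustration of the crowd-wisdom idea in the abstract above, the sketch below turns per-response stance labels into a single "scepticism" share (denials plus questions). The function names, the label strings and the example thread are hypothetical and are not taken from the thesis; this is only one simple feature that a veracity classifier might consume, not the thesis's models.

```python
from collections import Counter

# The four stance classes named in the abstract.
# The exact label strings used here are an assumption.
STANCES = ("support", "deny", "query", "comment")


def stance_proportions(labels):
    """Return the share of each stance among the responses to one rumour."""
    counts = Counter(labels)
    total = len(labels)
    if total == 0:
        # No responses: every share is zero.
        return {stance: 0.0 for stance in STANCES}
    return {stance: counts[stance] / total for stance in STANCES}


def scepticism(labels):
    """Share of responses that deny or question the rumour."""
    props = stance_proportions(labels)
    return props["deny"] + props["query"]


# Hypothetical stance labels for the responses to a single rumour.
responses = ["comment", "deny", "query", "support",
             "deny", "comment", "query", "deny"]
print(stance_proportions(responses))
print(scepticism(responses))
```

A higher scepticism share would, following the observation cited from Zhao et al. (2015), point towards a rumour that is more likely to be proven false, although how strongly is an empirical question.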

    QMUL-SDS at CheckThat! 2020: Determining COVID-19 Tweet Check-Worthiness Using an Enhanced CT-BERT with Numeric Expressions

    This paper describes the participation of the QMUL-SDS team in Task 1 of the CLEF 2020 CheckThat! shared task. The purpose of this task is to determine the check-worthiness of tweets about COVID-19 in order to identify and prioritise tweets that need fact-checking. The overarching aim is to further support ongoing efforts to protect the public from fake news and help people find reliable information. We describe and analyse the results of our submissions. We show that a CNN using COVID-Twitter-BERT (CT-BERT) enhanced with numeric expressions can effectively improve performance over baseline results. We also show results of augmenting the training data with rumours on other topics. Our best system ranked fourth in the task, with encouraging outcomes showing potential for improved results in the future.

    Stance classification in rumours as a sequential task exploiting the tree structure of social media conversations

    Rumour stance classification, the task of determining whether each tweet in a collection discussing a rumour is supporting, denying, questioning or simply commenting on the rumour, has been attracting substantial interest. Here we introduce a novel approach that makes use of the sequence of transitions observed in tree-structured conversation threads on Twitter. The conversation threads are formed by harvesting users’ replies to one another, which results in a nested tree-like structure. Previous work addressing the stance classification task has treated each tweet as a separate unit. Here we analyse tweets by virtue of their position in a sequence and test two sequential classifiers, Linear-Chain CRF and Tree CRF, each of which makes different assumptions about the conversational structure. We experiment with eight Twitter datasets, collected during breaking news, and show that exploiting the sequential structure of Twitter conversations achieves significant improvements over the non-sequential methods. Our work is the first to model Twitter conversations as a tree structure in this manner, introducing a novel way of tackling NLP tasks on Twitter conversations.
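To make the tree-structured thread idea more concrete, the sketch below splits a reply tree into root-to-leaf branches, which is one plausible way to obtain the sequences a linear-chain classifier would label. The abstract does not describe the paper's actual preprocessing, so the function name, the `parent_of` mapping and the example thread are all hypothetical.

```python
from collections import defaultdict


def root_to_leaf_branches(parent_of, root):
    """List every root-to-leaf path in a reply tree.

    parent_of maps each reply id to the id of the post it replies to.
    """
    # Invert the child -> parent mapping into parent -> children.
    children = defaultdict(list)
    for child, parent in parent_of.items():
        children[parent].append(child)

    branches = []
    stack = [[root]]  # each stack entry is a partial path starting at the root
    while stack:
        path = stack.pop()
        replies = children.get(path[-1], [])
        if not replies:
            # The last post has no replies, so the path is a complete branch.
            branches.append(path)
        for reply in replies:
            stack.append(path + [reply])
    # Sort so the result does not depend on traversal order.
    return sorted(branches)


# Hypothetical thread: r1 and r2 reply to the source tweet,
# r3 replies to r1, and r4 replies to r3.
thread = {"r1": "src", "r2": "src", "r3": "r1", "r4": "r3"}
branches = root_to_leaf_branches(thread, "src")
print(branches)
```

Each branch keeps the order in which replies follow one another, so a sequential model can condition a tweet's stance on the tweets earlier in the same branch. A tree-based model would instead work on the whole `parent_of` structure at once.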

    Learning disentangled latent topics for Twitter rumour veracity classification

    With the rapid growth of social media in the past decade, the news is no longer controlled by just a few mainstream sources. Users themselves create large numbers of potentially fictitious rumours, necessitating automated veracity classification systems. Here we present a novel approach to automatically classifying rumours circulating on Twitter with respect to their veracity. We use a model built on a Variational Autoencoder which disentangles the informational content of a tweet from the manner in which the information is written. This is achieved by obtaining latent topic vectors in an adversarial learning setting using the auxiliary task of stance classification. The latent vectors learnt in this way are used to predict rumour veracity, obtaining state-of-the-art accuracy scores on the PHEME dataset.